= \REXML Tutorial

== Why \REXML?

- Ruby's \REXML library is part of the Ruby distribution,
  so using it requires no gem installations.
- \REXML is fully maintained.
- \REXML is mature, having been in use for many years.

== To Include, or Not to Include?

REXML is a module.
To use it, you must require it:

  require 'rexml' # => true

If you do not also include it, you must fully qualify references to REXML:

  REXML::Document # => REXML::Document

If you also include the module, you may optionally omit <tt>REXML::</tt>:

  include REXML
  Document # => REXML::Document
  REXML::Document # => REXML::Document

== Preliminaries

All examples here assume that the following code has been executed:

  require 'rexml'
  include REXML

The source XML for many examples here is from file
{books.xml}[https://www.w3schools.com/xml/books.xml] at w3schools.com.
You may find it convenient to open that page in a new tab
(Ctrl-click in some browsers).

Note that your browser may display the XML with modified whitespace
and without the XML declaration, which in this case is:

  <?xml version="1.0" encoding="UTF-8"?>

For convenience, we capture the XML into a string variable:

  require 'open-uri'
  source_string = URI.open('https://www.w3schools.com/xml/books.xml').read

And into a file:

  File.write('source_file.xml', source_string)

Throughout these examples, variable +doc+ will hold only the document
derived from these sources:

  doc = Document.new(source_string)

== Parsing \XML \Source

=== Parsing a Document

Use method REXML::Document::new to parse XML source.

The source may be a string:

  doc = Document.new(source_string)

Or an \IO stream:

  doc = File.open('source_file.xml', 'r') do |io|
    Document.new(io)
  end

Method <tt>URI.open</tt> returns an IO-like object
(a StringIO object for a small response such as this one),
so the source can be from a web page:

  require 'open-uri'
  io = URI.open("https://www.w3schools.com/xml/books.xml")
  io.class # => StringIO
  doc = Document.new(io)

For any of these sources, the returned object is an REXML::Document:

  doc       # => <UNDEFINED> ... </>
  doc.class # => REXML::Document

Note: <tt>'UNDEFINED'</tt> is the "name" displayed for a document,
even though <tt>doc.name</tt> returns an empty string <tt>""</tt>.

A parsed document may produce \REXML objects of many classes,
but the two that are likely to be of greatest interest are
REXML::Document and REXML::Element.
These two classes are covered in great detail in this tutorial.

=== Context (Parsing Options)

The context for parsing a document is a hash that influences
the way the XML is read and stored.

The context entries are:

- +:respect_whitespace+: controls treatment of whitespace.
- +:compress_whitespace+: determines whether whitespace is compressed.
- +:ignore_whitespace_nodes+: determines whether whitespace-only nodes are to be ignored.
- +:raw+: controls treatment of special characters and entities.

See {Element Context}[../context_rdoc.html].
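
As a minimal sketch of how a context entry changes parsing, the following compares
the default behavior with +:ignore_whitespace_nodes+ set to +:all+.
The sample XML and the variable names are assumptions made for this illustration only:

```ruby
require 'rexml/document'

# Hypothetical sample XML, chosen only for this sketch.
xml = "<root>\n  <a/>\n  <b/>\n</root>"

# Default context: whitespace-only text nodes are kept as children.
default_doc = REXML::Document.new(xml)
default_size = default_doc.root.children.size

# With :ignore_whitespace_nodes set to :all, such nodes are dropped while parsing.
trimmed_doc = REXML::Document.new(xml, {ignore_whitespace_nodes: :all})
trimmed_size = trimmed_doc.root.children.size

p default_size # => 5 (three whitespace text nodes plus two elements)
p trimmed_size # => 2 (the two elements only)
```

The context hash is given as the second argument to REXML::Document::new
and is passed down to the elements created during parsing.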

== Exploring the Document

An REXML::Document object represents an XML document.

The object inherits from its ancestor classes:

- REXML::Child (includes module REXML::Node)
  - REXML::Parent (includes module {Enumerable}[rdoc-ref:Enumerable]).
    - REXML::Element (includes module REXML::Namespace).
      - REXML::Document
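
The hierarchy above can be checked from Ruby itself.
This is a small sketch that uses only standard reflection methods
(+superclass+ and <tt>include?</tt>):

```ruby
require 'rexml/document'

# Each class descends from the one listed above it.
p REXML::Document.superclass                # => REXML::Element
p REXML::Element.superclass                 # => REXML::Parent
p REXML::Parent.superclass                  # => REXML::Child

# Each class includes the module named next to it.
p REXML::Child.include?(REXML::Node)        # => true
p REXML::Parent.include?(Enumerable)        # => true
p REXML::Element.include?(REXML::Namespace) # => true
```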

This section covers only those properties and methods that are unique to a document
(that is, not inherited or included).

=== Document Properties

A document has several properties (other than its children):

- Document type.
- Node type.
- Name.
- Document.
- XPath.

[Document Type]

  A document may have a document type:

    my_xml = '<!DOCTYPE foo>'
    my_doc = Document.new(my_xml)
    doc_type = my_doc.doctype
    doc_type.class # => REXML::DocType
    doc_type.to_s  # => "<!DOCTYPE foo>"

[Node Type]

  A document also has a node type (always +:document+):

    doc.node_type # => :document

[Name]

  A document has a name (always an empty string):

    doc.name # => ""

[Document]

  \Method REXML::Document#document returns +self+:

    doc.document == doc # => true

  An object of a different class (\REXML::Element or \REXML::Child)
  may have a document, which is the document to which the object belongs;
  if so, that document will be an \REXML::Document object.

    doc.root.document.class # => REXML::Document

[XPath]

  \Method REXML::Element#xpath returns the string xpath to the element,
  relative to its most distant ancestor:

    doc.root.class             # => REXML::Element
    doc.root.xpath             # => "/bookstore"
    doc.root.texts.first       # => "\n\n"
    doc.root.texts.first.xpath # => "/bookstore/text()"

  If there is no ancestor, returns the expanded name of the element:

    Element.new('foo').xpath # => "foo"

=== Document Children

A document may have children of these types:

- XML declaration.
- Root element.
- Text.
- Processing instructions.
- Comments.
- CDATA.

[XML Declaration]

  A document may have an XML declaration, which is stored as an REXML::XMLDecl object:

    doc.xml_decl       # => <?xml ... ?>
    doc.xml_decl.class # => REXML::XMLDecl

    Document.new('').xml_decl # => <?xml ... ?>

    my_xml = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    my_doc = Document.new(my_xml)
    xml_decl = my_doc.xml_decl
    xml_decl.to_s  # => "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>"

  The version, encoding, and stand-alone values may be retrieved separately:

    my_doc.version      # => "1.0"
    my_doc.encoding     # => "UTF-8"
    my_doc.stand_alone? # => "yes"

[Root Element]

  A document may have a single element child, called the _root_ _element_,
  which is stored as an REXML::Element object;
  it may be retrieved with method +root+:

    doc.root           # => <bookstore> ... </>
    doc.root.class     # => REXML::Element

    Document.new('').root # => nil

[Text]

  A document may have text passages, each of which is stored
  as an REXML::Text object:

    doc.texts.each {|t| p [t.class, t] }

  Output:

    [REXML::Text, "\n"]

[Processing Instructions]

  A document may have processing instructions, which are stored
  as REXML::Instruction objects:



  Output:

    [REXML::Instruction, <?p-i my-application ...?>]
    [REXML::Instruction, <?p-i my-application ...?>]
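
Output of this form can be produced by a sketch such as the following.
The instruction contents (+foo+ and +bar+) and the <tt><root/></tt> element
are assumptions made for illustration; only the target name
<tt>my-application</tt> is taken from the output above:

```ruby
require 'rexml/document'

# Hypothetical source: the instruction contents and the root element
# are placeholders chosen for this sketch.
my_xml = <<-EOT
<?my-application foo?>
<?my-application bar?>
<root/>
EOT
my_doc = REXML::Document.new(my_xml)

# Print the class and the brief representation of each top-level instruction.
my_doc.instructions.each {|i| p [i.class, i] }
```

The brief representation shows only the target of each instruction;
its content is replaced by an ellipsis.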

[Comments]

  A document may have comments, which are stored
  as REXML::Comment objects:

    my_xml = <<-EOT
      <!--foo-->
      <!--bar-->
    EOT
    my_doc = Document.new(my_xml)
    my_doc.comments.each {|c| p [c.class, c] }

  Output:

    [REXML::Comment, #<REXML::Comment: @parent=<UNDEFINED> ... </>, @string="foo">]
    [REXML::Comment, #<REXML::Comment: @parent=<UNDEFINED> ... </>, @string="bar">]

[CDATA]

  A document may have CDATA entries, which are stored
  as REXML::CData objects:

    my_xml = <<-EOT
      <![CDATA[foo]]>
      <![CDATA[bar]]>
    EOT
    my_doc = Document.new(my_xml)
    my_doc.cdatas.each {|cd| p [cd.class, cd] }

  Output:

    [REXML::CData, "foo"]
    [REXML::CData, "bar"]

The payload of a document is a tree of nodes, descending from the root element:

  doc.root.children.each do |child|
    p [child.class, child]
  end

Output:

  [REXML::Text, "\n\n"]
  [REXML::Element, <book category='cooking'> ... </>]
  [REXML::Text, "\n\n"]
  [REXML::Element, <book category='children'> ... </>]
  [REXML::Text, "\n\n"]
  [REXML::Element, <book category='web'> ... </>]
  [REXML::Text, "\n\n"]
  [REXML::Element, <book category='web' cover='paperback'> ... </>]
  [REXML::Text, "\n\n"]

== Exploring an Element

An REXML::Element object represents an XML element.

The object inherits from its ancestor classes:

- REXML::Child (includes module REXML::Node)
  - REXML::Parent (includes module {Enumerable}[rdoc-ref:Enumerable]).
    - REXML::Element (includes module REXML::Namespace).

This section covers methods:

- Defined in REXML::Element itself.
- Inherited from REXML::Parent and REXML::Child.
- Included from REXML::Node.

=== Inside the Element

[Brief String Representation]

  Use method REXML::Element#inspect to retrieve a brief string representation.

    doc.root.inspect # => "<bookstore> ... </>"

  The ellipsis (<tt>...</tt>) indicates that the element has children.
  When there are no children, the ellipsis is omitted:

    Element.new('foo').inspect # => "<foo/>"

  If the element has attributes, those are also included:

    doc.root.elements.first.inspect # => "<book category='cooking'> ... </>"

[Extended String Representation]

  Use inherited method REXML::Child#bytes to retrieve an extended
  string representation.

    doc.root.bytes # => "<bookstore>\n\n<book category='cooking'>\n  <title lang='en'>Everyday Italian</title>\n  <author>Giada De Laurentiis</author>\n  <year>2005</year>\n  <price>30.00</price>\n</book>\n\n<book category='children'>\n  <title lang='en'>Harry Potter</title>\n  <author>J K. Rowling</author>\n  <year>2005</year>\n  <price>29.99</price>\n</book>\n\n<book category='web'>\n  <title lang='en'>XQuery Kick Start</title>\n  <author>James McGovern</author>\n  <author>Per Bothner</author>\n  <author>Kurt Cagle</author>\n  <author>James Linn</author>\n  <author>Vaidyanathan Nagarajan</author>\n  <year>2003</year>\n  <price>49.99</price>\n</book>\n\n<book category='web' cover='paperback'>\n  <title lang='en'>Learning XML</title>\n  <author>Erik T. Ray</author>\n  <year>2003</year>\n  <price>39.95</price>\n</book>\n\n</bookstore>"

[Node Type]

  Use method REXML::Element#node_type to retrieve the node type (always +:element+):

    doc.root.node_type # => :element

[Raw Mode]

  Use method REXML::Element#raw to retrieve whether (+true+ or +nil+)
  raw mode is set.

    doc.root.raw # => nil

[Context]

  Use method REXML::Element#context to retrieve the context hash
  (see {Element Context}[../context_rdoc.html]):

    doc.root.context # => {}

=== Relationships

An element may have:

- Ancestors.
- Siblings.
- Children.

==== Ancestors

[Containing Document]

  Use method REXML::Element#document to retrieve the containing document, if any:

    ele = doc.root.elements.first   # => <book category='cooking'> ... </>
    ele.document                    # => <UNDEFINED> ... </>
    ele = Element.new('foo')        # => <foo/>
    ele.document                    # => nil

[Root Element]

  Use method REXML::Element#root to retrieve the root element:

    ele = doc.root.elements.first   # => <book category='cooking'> ... </>
    ele.root                        # => <bookstore> ... </>
    ele = Element.new('foo')        # => <foo/>
    ele.root                        # => <foo/>

[Root Node]

  Use method REXML::Element#root_node to retrieve the most distant ancestor,
  which is the containing document, if any, otherwise the root element:

    ele = doc.root.elements.first   # => <book category='cooking'> ... </>
    ele.root_node                   # => <UNDEFINED> ... </>
    ele = Element.new('foo')        # => <foo/>
    ele.root_node                   # => <foo/>

[Parent]

  Use inherited method REXML::Child#parent to retrieve the parent:

    ele = doc.root                # => <bookstore> ... </>
    ele.parent                    # => <UNDEFINED> ... </>
    ele = doc.root.elements.first # => <book category='cooking'> ... </>
    ele.parent                    # => <bookstore> ... </>

  Use included method REXML::Node#index_in_parent to retrieve the index
  of the element among all of its parent's children (not just the element children).
  Note that, like the index for <tt>doc.root.elements[n]</tt>,
  the returned index is 1-based, but it counts children of all types.

    doc.root.children # =>
      # ["\n\n",
      #  <book category='cooking'> ... </>,
      #  "\n\n",
      #  <book category='children'> ... </>,
      #  "\n\n",
      #  <book category='web'> ... </>,
      #  "\n\n",
      #  <book category='web' cover='paperback'> ... </>,
      #  "\n\n"]
    ele = doc.root.elements[1] # => <book category='cooking'> ... </>
    ele.index_in_parent        # => 2
    ele = doc.root.elements[2] # => <book category='children'> ... </>
    ele.index_in_parent        # => 4

==== Siblings

[Next Element]

  Use method REXML::Element#next_element to retrieve the first following
  sibling that is itself an element (+nil+ if there is none):

    ele = doc.root.elements[1]
    while ele do
      p [ele.class, ele]
      ele = ele.next_element
    end
    p ele

  Output:

    [REXML::Element, <book category='cooking'> ... </>]
    [REXML::Element, <book category='children'> ... </>]
    [REXML::Element, <book category='web'> ... </>]
    [REXML::Element, <book category='web' cover='paperback'> ... </>]
    nil

[Previous Element]

  Use method REXML::Element#previous_element to retrieve the first preceding
  sibling that is itself an element (+nil+ if there is none):

    ele = doc.root.elements[4]
    while ele do
      p [ele.class, ele]
      ele = ele.previous_element
    end
    p ele

  Output:

    [REXML::Element, <book category='web' cover='paperback'> ... </>]
    [REXML::Element, <book category='web'> ... </>]
    [REXML::Element, <book category='children'> ... </>]
    [REXML::Element, <book category='cooking'> ... </>]
    nil

[Next Node]

  Use included method REXML::Node#next_sibling_node
  (or its alias <tt>next_sibling</tt>) to retrieve the first following node
  regardless of its class:

    node = doc.root.children[0]
    while node do
      p [node.class, node]
      node = node.next_sibling
    end
    p node

  Output:

    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='cooking'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='children'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='web'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='web' cover='paperback'> ... </>]
    [REXML::Text, "\n\n"]
    nil

[Previous Node]

  Use included method REXML::Node#previous_sibling_node
  (or its alias <tt>previous_sibling</tt>) to retrieve the first preceding node
  regardless of its class:

    node = doc.root.children[-1]
    while node do
      p [node.class, node]
      node = node.previous_sibling
    end
    p node

  Output:

    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='web' cover='paperback'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='web'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='children'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='cooking'> ... </>]
    [REXML::Text, "\n\n"]
    nil

==== Children

[Child Count]

  Use inherited method REXML::Parent#size to retrieve the count
  of nodes (of all types) in the element:

    doc.root.size # => 9

[Child Nodes]

  Use inherited method REXML::Parent#children to retrieve an array
  of the child nodes (of all types):

    doc.root.children # =>
                      # ["\n\n",
                      #  <book category='cooking'> ... </>,
                      #  "\n\n",
                      #  <book category='children'> ... </>,
                      #  "\n\n",
                      #  <book category='web'> ... </>,
                      #  "\n\n",
                      #  <book category='web' cover='paperback'> ... </>,
                      #  "\n\n"]

[Child at Index]

  Use method REXML::Element#[] to retrieve the child at a given numerical index,
  or +nil+ if there is no such child:

    doc.root[0]  # => "\n\n"
    doc.root[1]  # => <book category='cooking'> ... </>
    doc.root[7]  # => <book category='web' cover='paperback'> ... </>
    doc.root[8]  # => "\n\n"

    doc.root[-1] # => "\n\n"
    doc.root[-2] # => <book category='web' cover='paperback'> ... </>

    doc.root[50] # => nil

[Index of Child]

  Use method REXML::Parent#index to retrieve the zero-based child index
  of the given object, or <tt>#size - 1</tt> if there is no such child:

    ele = doc.root     # => <bookstore> ... </>
    ele.index(ele[0])  # => 0
    ele.index(ele[1])  # => 1
    ele.index(ele[7])  # => 7
    ele.index(ele[8])  # => 8

    ele.index(ele[-1]) # => 8
    ele.index(ele[-2]) # => 7

    ele.index(ele[50]) # => 8

[Element Children]

  Use method REXML::Element#has_elements? to retrieve whether the element
  has element children:

    doc.root.has_elements?                  # => true
    REXML::Element.new('foo').has_elements? # => false

  Use method REXML::Element#elements to retrieve the REXML::Elements object
  containing the element children:

    eles = doc.root.elements
    eles      # => #<REXML::Elements:0x000001ee2848e960 @element=<bookstore> ... </>>
    eles.size # => 4
    eles.each {|e| p [e.class, e] }

  Output:

    [REXML::Element, <book category='cooking'> ... </>]
    [REXML::Element, <book category='children'> ... </>]
    [REXML::Element, <book category='web'> ... </>]
    [REXML::Element, <book category='web' cover='paperback'> ... </>]

Note that while in this example, all the element children of the root element are
elements of the same name, <tt>'book'</tt>, that is not true of all documents;
a root element (or any other element) may have any mixture of child elements.
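
As a small sketch of such a mixture, the following parses a hypothetical document
(the element names +a+, +b+, and +c+ are assumptions for illustration)
and collects the names of the element children of its root:

```ruby
require 'rexml/document'

# Hypothetical document whose root has element children of differing names.
mixed = REXML::Document.new('<root><a/><b/><a/><c/></root>')

# REXML::Elements is enumerable, so the element children can be mapped directly.
names = mixed.root.elements.map {|e| e.name }
p names # => ["a", "b", "a", "c"]
```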

[CDATA Children]

  Use method REXML::Element#cdatas to retrieve a frozen array of CDATA children:

    my_xml = <<-EOT
      <root>
        <![CDATA[foo]]>
        <![CDATA[bar]]>
      </root>
    EOT
    my_doc = REXML::Document.new(my_xml)
    cdatas = my_doc.root.cdatas
    cdatas.frozen?              # => true
    cdatas.map {|cd| cd.class } # => [REXML::CData, REXML::CData]

[Comment Children]

  Use method REXML::Element#comments to retrieve a frozen array of comment children:

    my_xml = <<-EOT
      <root>
        <!--foo-->
        <!--bar-->
      </root>
    EOT
    my_doc = REXML::Document.new(my_xml)
    comments = my_doc.root.comments
    comments.frozen?            # => true
    comments.map {|c| c.class } # => [REXML::Comment, REXML::Comment]
    comments.map {|c| c.to_s }  # => ["foo", "bar"]

[Processing Instruction Children]

  Use method REXML::Element#instructions to retrieve a frozen array
  of processing instruction children:

    my_xml = <<-EOT
      <root>
        <?target0 foo?>
        <?target1 bar?>
      </root>
    EOT
    my_doc = REXML::Document.new(my_xml)
    instrs = my_doc.root.instructions
    instrs.frozen?            # => true
    instrs.map {|i| i.class } # => [REXML::Instruction, REXML::Instruction]
    instrs.map {|i| i.to_s }  # => ["<?target0 foo?>", "<?target1 bar?>"]

[Text Children]

  Use method REXML::Element#has_text? to retrieve whether the element
  has text children:

    doc.root.has_text?                  # => true
    REXML::Element.new('foo').has_text? # => false

  Use method REXML::Element#texts to retrieve a frozen array of text children:

    my_xml = '<root><a/>text<b/>more<c/></root>'
    my_doc = REXML::Document.new(my_xml)
    texts = my_doc.root.texts
    texts.frozen?            # => true
    texts.map {|t| t.class } # => [REXML::Text, REXML::Text]
    texts.map {|t| t.to_s }  # => ["text", "more"]

[Parenthood]

  Use inherited method REXML::Parent#parent? to retrieve whether the element is a parent;
  always returns +true+; for a node that is not a parent (such as an REXML::Text object),
  method REXML::Node#parent? returns +false+.

    doc.root.parent? # => true

=== Element Attributes

Use method REXML::Element#has_attributes? to return whether the element
has attributes:

  ele = doc.root           # => <bookstore> ... </>
  ele.has_attributes?      # => false
  ele = ele.elements.first # => <book category='cooking'> ... </>
  ele.has_attributes?      # => true

Use method REXML::Element#attributes to return the hash
containing the attributes for the element.
Each hash key is a string attribute name;
each hash value is an REXML::Attribute object.

  ele = doc.root                  # => <bookstore> ... </>
  attrs = ele.attributes          # => {}

  ele = ele.elements.first        # => <book category='cooking'> ... </>
  attrs = ele.attributes          # => {"category"=>category='cooking'}
  attrs.size                      # => 1
  attr_name = attrs.keys.first    # => "category"
  attr_name.class                 # => String
  attr_value = attrs.values.first # => category='cooking'
  attr_value.class                # => REXML::Attribute

Use method REXML::Element#[] to retrieve the string value for a given attribute,
which may be given as either a string or a symbol:

  ele = doc.root.elements.first # => <book category='cooking'> ... </>
  attr_value = ele['category']  # => "cooking"
  attr_value.class              # => String
  ele['nosuch']                 # => nil

Use method REXML::Element#attribute to retrieve the value of a named attribute:

  my_xml = "<root xmlns:a='a' a:x='a:x' x='x'/>"
  my_doc = REXML::Document.new(my_xml)
  my_doc.root.attribute("x")      # => x='x'
  my_doc.root.attribute("x", "a") # => a:x='a:x'

== Whitespace

Use method REXML::Element#ignore_whitespace_nodes to determine whether
whitespace nodes were ignored when the XML was parsed;
returns +true+ if so, +nil+ otherwise.

Use method REXML::Element#whitespace to determine whether whitespace
is respected for the element; returns +true+ if so, +false+ otherwise.
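
A minimal sketch of both methods follows.
It builds elements directly with REXML::Element::new, passing a context hash
as the third argument; the element name and the particular context entries
are assumptions chosen for illustration:

```ruby
require 'rexml/document'

# An element created without a context respects whitespace
# and does not ignore whitespace nodes.
plain = REXML::Element.new('foo')
p plain.whitespace                 # => true
p plain.ignore_whitespace_nodes    # falsy: there is no context

# Hypothetical context that compresses whitespace for all elements.
compressed = REXML::Element.new('foo', nil, {compress_whitespace: :all})
p compressed.whitespace            # => false

# Hypothetical context that ignores whitespace-only nodes for all elements.
ignoring = REXML::Element.new('foo', nil, {ignore_whitespace_nodes: :all})
p ignoring.ignore_whitespace_nodes # => true
```

Both methods consult the element's context each time they are called,
so the results reflect the context currently assigned to the element.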

== Namespaces

Use method REXML::Element#namespace to retrieve the string namespace URI
for the element, which may derive from one of its ancestors:

  xml_string = <<-EOT
    <root>
       <a xmlns='1' xmlns:y='2'>
         <b/>
         <c xmlns:z='3'/>
       </a>
    </root>
  EOT
  d = Document.new(xml_string)
  b = d.elements['//b']
  b.namespace      # => "1"
  b.namespace('y') # => "2"
  b.namespace('nosuch') # => nil

Use method REXML::Element#namespaces to retrieve a hash of all defined namespaces
in the element and its ancestors:

  xml_string = <<-EOT
    <root>
       <a xmlns:x='1' xmlns:y='2'>
         <b/>
         <c xmlns:z='3'/>
       </a>
    </root>
  EOT
  d = Document.new(xml_string)
  d.elements['//a'].namespaces # => {"x"=>"1", "y"=>"2"}
  d.elements['//b'].namespaces # => {"x"=>"1", "y"=>"2"}
  d.elements['//c'].namespaces # => {"x"=>"1", "y"=>"2", "z"=>"3"}

Use method REXML::Element#prefixes to retrieve an array of the string prefixes (names)
of all defined namespaces in the element and its ancestors:

  xml_string = <<-EOT
    <root>
       <a xmlns:x='1' xmlns:y='2'>
         <b/>
         <c xmlns:z='3'/>
       </a>
    </root>
  EOT
  d = Document.new(xml_string, {compress_whitespace: :all})
  d.elements['//a'].prefixes # => ["x", "y"]
  d.elements['//b'].prefixes # => ["x", "y"]
  d.elements['//c'].prefixes # => ["x", "y", "z"]

== Traversing

You can use certain methods to traverse children of the element.
Each child that meets given criteria is yielded to the given block.

[Traverse All Children]

  Use inherited method REXML::Parent#each (or its alias #each_child) to traverse
  all children of the element:

    doc.root.each {|child| p [child.class, child] }

  Output:

    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='cooking'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='children'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='web'> ... </>]
    [REXML::Text, "\n\n"]
    [REXML::Element, <book category='web' cover='paperback'> ... </>]
    [REXML::Text, "\n\n"]

[Traverse Element Children]

  Use method REXML::Element#each_element to traverse only the element children
  of the element:

    doc.root.each_element {|e| p [e.class, e] }

  Output:

    [REXML::Element, <book category='cooking'> ... </>]
    [REXML::Element, <book category='children'> ... </>]
    [REXML::Element, <book category='web'> ... </>]
    [REXML::Element, <book category='web' cover='paperback'> ... </>]

[Traverse Element Children with Attribute]

  Use method REXML::Element#each_element_with_attribute with the single argument
  +attr_name+ to traverse each element child that has the given attribute:

    my_doc = Document.new '<a><b id="1"/><c id="2"/><d id="1"/><e/></a>'
    my_doc.root.each_element_with_attribute('id') {|e| p [e.class, e] }

  Output:

    [REXML::Element, <b id='1'/>]
    [REXML::Element, <c id='2'/>]
    [REXML::Element, <d id='1'/>]

  Use the same method with a second argument +value+ to traverse
  each element child that has the given attribute and value:

    my_doc.root.each_element_with_attribute('id', '1') {|e| p [e.class, e] }

  Output:

    [REXML::Element, <b id='1'/>]
    [REXML::Element, <d id='1'/>]

  Use the same method with a third argument +max+ to traverse
  no more than the given number of element children:

    my_doc.root.each_element_with_attribute('id', '1', 1) {|e| p [e.class, e] }

  Output:

    [REXML::Element, <b id='1'/>]

  Use the same method with a fourth argument +xpath+ to traverse
  only those element children that match the given xpath:

    my_doc.root.each_element_with_attribute('id', '1', 2, '//d') {|e| p [e.class, e] }

  Output:

    [REXML::Element, <d id='1'/>]

[Traverse Element Children with Text]

  Use method REXML::Element#each_element_with_text with no arguments
  to traverse those element children that have text:

    my_doc = Document.new '<a><b>b</b><c>b</c><d>d</d><e/></a>'
    my_doc.root.each_element_with_text {|e| p [e.class, e] }

  Output:

    [REXML::Element, <b> ... </>]
    [REXML::Element, <c> ... </>]
    [REXML::Element, <d> ... </>]

  Use the same method with the single argument +text+ to traverse
  those element children that have exactly that text:

    my_doc.root.each_element_with_text('b') {|e| p [e.class, e] }

  Output:

    [REXML::Element, <b> ... </>]
    [REXML::Element, <c> ... </>]

  Use the same method with additional second argument +max+ to traverse
  no more than the given number of element children:

    my_doc.root.each_element_with_text('b', 1) {|e| p [e.class, e] }

  Output:

    [REXML::Element, <b> ... </>]

  Use the same method with additional third argument +xpath+ to traverse
  only those element children that also match the given xpath:

    my_doc.root.each_element_with_text('b', 2, '//c') {|e| p [e.class, e] }

  Output:

    [REXML::Element, <c> ... </>]

[Traverse Element Children's Indexes]

  Use inherited method REXML::Parent#each_index to traverse all children's indexes
  (not just those of element children):

    doc.root.each_index {|i| print i }

  Output:

    012345678

[Traverse Children Recursively]

  Use included method REXML::Node#each_recursive to traverse all element descendants
  recursively (text nodes and other non-element nodes are skipped):

    doc.root.each_recursive {|child| p [child.class, child] }

  Output:

    [REXML::Element, <book category='cooking'> ... </>]
    [REXML::Element, <title lang='en'> ... </>]
    [REXML::Element, <author> ... </>]
    [REXML::Element, <year> ... </>]
    [REXML::Element, <price> ... </>]
    [REXML::Element, <book category='children'> ... </>]
    [REXML::Element, <title lang='en'> ... </>]
    [REXML::Element, <author> ... </>]
    [REXML::Element, <year> ... </>]
    [REXML::Element, <price> ... </>]
    [REXML::Element, <book category='web'> ... </>]
    [REXML::Element, <title lang='en'> ... </>]
    [REXML::Element, <author> ... </>]
    [REXML::Element, <author> ... </>]
    [REXML::Element, <author> ... </>]
    [REXML::Element, <author> ... </>]
    [REXML::Element, <author> ... </>]
    [REXML::Element, <year> ... </>]
    [REXML::Element, <price> ... </>]
    [REXML::Element, <book category='web' cover='paperback'> ... </>]
    [REXML::Element, <title lang='en'> ... </>]
    [REXML::Element, <author> ... </>]
    [REXML::Element, <year> ... </>]
    [REXML::Element, <price> ... </>]

== Searching

You can use certain methods to search among the descendants of an element.

Use method REXML::Element#get_elements to retrieve all element children of the element
that match the given +xpath+:

  xml_string = <<-EOT
  <root>
    <a level='1'>
      <a level='2'/>
    </a>
  </root>
  EOT
  d = Document.new(xml_string)
  d.root.get_elements('//a') # => [<a level='1'> ... </>, <a level='2'/>]

Use method REXML::Element#get_text with no argument to retrieve the first text node
among the children of the element:

  my_doc = Document.new "<p>some text <b>this is bold!</b> more text</p>"
  text_node = my_doc.root.get_text
  text_node.class # => REXML::Text
  text_node.to_s  # => "some text "

Use the same method with argument +xpath+
(an xpath string or, as here, a 1-based element index)
to retrieve the first text node in the first element child that matches:

  my_doc.root.get_text(1) # => "this is bold!"

Use method REXML::Element#text with no argument to retrieve the text (as a string)
from the first text node among the children of the element:

  my_doc = Document.new "<p>some text <b>this is bold!</b> more text</p>"
  text_node = my_doc.root.text
  text_node.class # => String
  text_node       # => "some text "

Use the same method with argument +xpath+
(an xpath string or, as here, a 1-based element index)
to retrieve the text from the first text node in the first element child that matches:

  my_doc.root.text(1) # => "this is bold!"

Use included method REXML::Node#find_first_recursive
to retrieve the first descendant element
for which the given block returns a truthy value, or +nil+ if none:

  doc.root.find_first_recursive do |ele|
    ele.name == 'price'
  end # => <price> ... </>
  doc.root.find_first_recursive do |ele|
    ele.name == 'nosuch'
  end # => nil

== Editing

=== Editing a Document

[Creating a Document]

  Create a new document with method REXML::Document::new:

    doc = Document.new(source_string)
    empty_doc = REXML::Document.new

[Adding to the Document]

  Add an XML declaration with method REXML::Document#add
  and an argument of type REXML::XMLDecl:

    my_doc = Document.new
    my_doc.xml_decl.to_s # => ""
    my_doc.add(XMLDecl.new('2.0'))
    my_doc.xml_decl.to_s # => "<?xml version='2.0'?>"

  Add a document type with method REXML::Document#add
  and an argument of type REXML::DocType:

    my_doc = Document.new
    my_doc.doctype.to_s # => ""
    my_doc.add(DocType.new('foo'))
    my_doc.doctype.to_s # => "<!DOCTYPE foo>"

  Add a node of any other REXML type with method REXML::Document#add and an argument
  that is not of type REXML::XMLDecl or REXML::DocType:

    my_doc = Document.new
    my_doc.add(Element.new('foo'))
    my_doc.to_s # => "<foo/>"

  Add an existing element as the root element with method REXML::Document#add_element:

    ele = Element.new('foo')
    my_doc = Document.new
    my_doc.add_element(ele)
    my_doc.root # => <foo/>

  Create and add an element as the root element with method REXML::Document#add_element:

    my_doc = Document.new
    my_doc.add_element('foo')
    my_doc.root # => <foo/>

=== Editing an Element

==== Creating an Element

Create a new element with method REXML::Element::new:

  ele = Element.new('foo') # => <foo/>

==== Setting Element Properties

Set the context for an element with method REXML::Element#context=
(see {Element Context}[../context_rdoc.html]):

  ele.context # => nil
  ele.context = {ignore_whitespace_nodes: :all}
  ele.context # => {:ignore_whitespace_nodes=>:all}

Set the parent for an element with inherited method REXML::Child#parent=:

  ele.parent # => nil
  ele.parent = Element.new('bar')
  ele.parent # => <bar/>

Set the text for an element with method REXML::Element#text=:

  ele.text # => nil
  ele.text = 'bar'
  ele.text # => "bar"

==== Adding to an Element

Add a node as the last child with inherited method REXML::Parent#add (or its alias #push):

  ele = Element.new('foo') # => <foo/>
  ele.push(Text.new('bar'))
  ele.push(Element.new('baz'))
  ele.children # => ["bar", <baz/>]

Add a node as the first child with inherited method REXML::Parent#unshift:

  ele = Element.new('foo') # => <foo/>
  ele.unshift(Element.new('bar'))
  ele.unshift(Text.new('baz'))
  ele.children # => ["baz", <bar/>]

Add an element as the last child with method REXML::Element#add_element:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_element(Element.new('baz'))
  ele.children # => [<bar/>, <baz/>]

Add a text node as the last child with method REXML::Element#add_text:

  ele = Element.new('foo') # => <foo/>
  ele.add_text('bar')
  ele.add_text(Text.new('baz'))
  ele.children # => ["bar", "baz"]

Insert a node before a given node with method REXML::Parent#insert_before:

  ele = Element.new('foo') # => <foo/>
  ele.add_text('bar')
  ele.add_text(Text.new('baz'))
  ele.children    # => ["bar", "baz"]
  target = ele[1] # => "baz"
  ele.insert_before(target, Text.new('bat'))
  ele.children    # => ["bar", "bat", "baz"]

Insert a node after a given node with method REXML::Parent#insert_after:

  ele = Element.new('foo') # => <foo/>
  ele.add_text('bar')
  ele.add_text(Text.new('baz'))
  ele.children    # => ["bar", "baz"]
  target = ele[0] # => "bar"
  ele.insert_after(target, Text.new('bat'))
  ele.children    # => ["bar", "bat", "baz"]

Add an attribute with method REXML::Element#add_attribute:

  ele = Element.new('foo') # => <foo/>
  ele.add_attribute('bar', 'baz')
  ele.add_attribute(Attribute.new('bat', 'bam'))
  ele.attributes # => {"bar"=>bar='baz', "bat"=>bat='bam'}

Add multiple attributes with method REXML::Element#add_attributes:

  ele = Element.new('foo') # => <foo/>
  ele.add_attributes({'bar' => 'baz', 'bat' => 'bam'})
  ele.add_attributes([['ban', 'bap'], ['bah', 'bad']])
  ele.attributes # => {"bar"=>bar='baz', "bat"=>bat='bam', "ban"=>ban='bap', "bah"=>bah='bad'}

Add a namespace with method REXML::Element#add_namespace:

  ele = Element.new('foo') # => <foo/>
  ele.add_namespace('bar')
  ele.add_namespace('baz', 'bat')
  ele.namespaces # => {"xmlns"=>"bar", "baz"=>"bat"}
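
The methods in this section can be combined to build a small tree.
This is a sketch only, assuming <tt>require 'rexml/document'</tt> and fully qualified names;
the element names and values are borrowed from the books source for illustration:

```ruby
require 'rexml/document'

book = REXML::Element.new('book')
book.add_attribute('category', 'web')

# add_element returns the newly added element, so it can be kept for further edits.
title = book.add_element('title')
title.add_attribute('lang', 'en')
title.add_text('Learning XML')

price = book.add_element('price')
price.add_text('39.95')

xml = book.to_s
# "<book category='web'><title lang='en'>Learning XML</title><price>39.95</price></book>"
```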

==== Deleting from an Element

Delete a specific child object with inherited method REXML::Parent#delete:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.children             # => [<bar/>, "baz"]
  target = ele[1]          # => "baz"
  ele.delete(target)       # => "baz"
  ele.children             # => [<bar/>]
  target = ele[0]          # => <bar/>
  ele.delete(target)       # => <bar/>
  ele.children             # => []

Delete a child at a specific index with inherited method REXML::Parent#delete_at:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.children             # => [<bar/>, "baz"]
  ele.delete_at(1)
  ele.children             # => [<bar/>]
  ele.delete_at(0)
  ele.children             # => []

Delete all children meeting a specified criterion with inherited method
REXML::Parent#delete_if:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children             # => [<bar/>, "baz", <bat/>, "bam"]
  ele.delete_if {|child| child.instance_of?(Text) }
  ele.children             # => [<bar/>, <bat/>]

Delete an element at a specific 1-based index with method REXML::Element#delete_element:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children             # => [<bar/>, "baz", <bat/>, "bam"]
  ele.delete_element(2)    # => <bat/>
  ele.children             # => [<bar/>, "baz", "bam"]
  ele.delete_element(1)    # => <bar/>
  ele.children             # => ["baz", "bam"]

Delete a specific element with the same method:

  ele = Element.new('foo')   # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children               # => [<bar/>, "baz", <bat/>, "bam"]
  target = ele.elements[2]   # => <bat/>
  ele.delete_element(target) # => <bat/>
  ele.children               # => [<bar/>, "baz", "bam"]

Delete an element matching an XPath with the same method:

  ele = Element.new('foo')    # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children                # => [<bar/>, "baz", <bat/>, "bam"]
  ele.delete_element('./bat') # => <bat/>
  ele.children                # => [<bar/>, "baz", "bam"]
  ele.delete_element('./bar') # => <bar/>
  ele.children                # => ["baz", "bam"]

Delete an attribute by name with method REXML::Element#delete_attribute:

  ele = Element.new('foo') # => <foo/>
  ele.add_attributes({'bar' => 'baz', 'bam' => 'bat'})
  ele.attributes           # => {"bar"=>bar='baz', "bam"=>bam='bat'}
  ele.delete_attribute('bam')
  ele.attributes           # => {"bar"=>bar='baz'}

Delete a namespace with method REXML::Element#delete_namespace:

  ele = Element.new('foo') # => <foo/>
  ele.add_namespace('bar')
  ele.add_namespace('baz', 'bat')
  ele.namespaces           # => {"xmlns"=>"bar", "baz"=>"bat"}
  ele.delete_namespace('xmlns')
  ele.namespaces           # => {"baz"=>"bat"}
  ele.delete_namespace('baz')
  ele.namespaces           # => {}

Remove an element from its parent with inherited method REXML::Child#remove:

  ele = Element.new('foo')    # => <foo/>
  parent = Element.new('bar') # => <bar/>
  parent.add_element(ele)     # => <foo/>
  parent.children.size        # => 1
  ele.remove                  # => <foo/>
  parent.children.size        # => 0
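
The two index-based deletions above count differently:
REXML::Parent#delete_at uses a 0-based index over all children,
while REXML::Element#delete_element uses a 1-based index over child elements only.
The sketch below, which assumes <tt>require 'rexml/document'</tt> and uses a hypothetical
lambda named +build+ to create identical elements, passes the same integer to each:

```ruby
require 'rexml/document'

# Hypothetical helper: builds <foo> with children [<bar/>, "baz", <bat/>, "bam"].
build = lambda do
  ele = REXML::Element.new('foo')
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele
end

by_child_index = build.call
by_child_index.delete_at(1)         # 0-based over all children: removes "baz".
after_delete_at = by_child_index.children.map(&:to_s)
# ["<bar/>", "<bat/>", "bam"]

by_element_index = build.call
by_element_index.delete_element(1)  # 1-based over elements only: removes <bar/>.
after_delete_element = by_element_index.children.map(&:to_s)
# ["baz", "<bat/>", "bam"]
```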

==== Replacing Nodes

Replace the node at a given 0-based index with inherited method REXML::Parent#[]=:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children             # => [<bar/>, "baz", <bat/>, "bam"]
  ele[2] = Text.new('bad') # => "bad"
  ele.children             # => [<bar/>, "baz", "bad", "bam"]

Replace a given node with another node with inherited method REXML::Parent#replace_child:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children             # => [<bar/>, "baz", <bat/>, "bam"]
  target = ele[2]          # => <bat/>
  ele.replace_child(target, Text.new('bah'))
  ele.children             # => [<bar/>, "baz", "bah", "bam"]

Replace +self+ with a given node with inherited method REXML::Child#replace_with:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children             # => [<bar/>, "baz", <bat/>, "bam"]
  target = ele[2]          # => <bat/>
  target.replace_with(Text.new('bah'))
  ele.children             # => [<bar/>, "baz", "bah", "bam"]
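
Replacement also updates the parent links of the nodes involved.
As a minimal sketch, assuming <tt>require 'rexml/document'</tt> and fully qualified names,
the replaced node ends up detached and the replacement points at the element:

```ruby
require 'rexml/document'

ele = REXML::Element.new('foo')
ele.add_element('bar')
ele.add_text('baz')

target = ele[0]                       # <bar/>
replacement = REXML::Text.new('bah')
target.replace_with(replacement)

children = ele.children.map(&:to_s)   # ["bah", "baz"]
old_parent = target.parent            # nil: the replaced node is detached.
new_parent = replacement.parent       # The element that now holds the replacement.
```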

=== Cloning

Create a shallow clone of an element with method REXML::Element#clone.
The clone contains the name and attributes, but not the parent or children:

  ele = Element.new('foo')
  ele.add_attributes({'bar' => 0, 'baz' => 1})
  ele.clone # => <foo bar='0' baz='1'/>

Create a shallow clone of a document with method REXML::Document#clone.
The XML declaration is copied; the document type and root element are not cloned:

  my_xml = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE foo><root/>'
  my_doc = Document.new(my_xml)
  clone_doc = my_doc.clone

  my_doc.xml_decl         # => <?xml ... ?>
  clone_doc.xml_decl      # => <?xml ... ?>

  my_doc.doctype.to_s     # => "<!DOCTYPE foo>"
  clone_doc.doctype.to_s  # => ""

  my_doc.root             # => <root/>
  clone_doc.root          # => nil

Create a deep clone of an element or document with inherited method REXML::Parent#deep_clone.
All nodes and attributes are copied:

  doc.to_s.size   # => 825
  clone  = doc.deep_clone
  clone.to_s.size # => 825
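
The difference between the two kinds of clone is easiest to see on a small element.
This sketch assumes <tt>require 'rexml/document'</tt> and fully qualified names;
the element and attribute names are illustrative only:

```ruby
require 'rexml/document'

ele = REXML::Element.new('foo')
ele.add_attribute('id', '1')
ele.add_element('bar').add_text('baz')

shallow = ele.clone
deep = ele.deep_clone
shallow_xml = shallow.to_s  # "<foo id='1'/>"
deep_xml = deep.to_s        # "<foo id='1'><bar>baz</bar></foo>"

# The deep clone owns its own copies of the children,
# so editing it leaves the original untouched.
deep.elements['bar'].text = 'changed'
original_text = ele.elements['bar'].text # "baz"
```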

== Writing the Document

Write a document to an \IO stream (defaults to <tt>$stdout</tt>)
with method REXML::Document#write:

  doc.write

Output:

  <?xml version='1.0' encoding='UTF-8'?>
  <bookstore>

  <book category='cooking'>
    <title lang='en'>Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>

  <book category='children'>
    <title lang='en'>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>

  <book category='web'>
    <title lang='en'>XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>

  <book category='web' cover='paperback'>
    <title lang='en'>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>

  </bookstore>

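The output stream need not be <tt>$stdout</tt>.
The sketch below assumes <tt>require 'rexml/document'</tt>, a small document held in a
variable named +my_doc+, and StringIO objects standing in for any writable stream
such as an open file; passing an indentation width as the second argument
requests indented output:

```ruby
require 'rexml/document'
require 'stringio'

my_doc = REXML::Document.new('<root><item>one</item></root>')

compact_io = StringIO.new
my_doc.write(compact_io)     # Default: no whitespace is added.
compact = compact_io.string  # "<root><item>one</item></root>"

pretty_io = StringIO.new
my_doc.write(pretty_io, 2)   # Second argument: indentation width.
pretty = pretty_io.string    # Same elements, spread over several lines.
```

Indented output adds whitespace to the document,
so the default form is the safer choice when whitespace is significant.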